Abstract We prove the convergence of greedy and randomized versions of Schwarz iterative
methods for solving linear elliptic variational problems based on infinite space splittings of a
Hilbert space. For the greedy case, we show a squared error decay rate of $O((m+1)^{-1})$ for
elements of an approximation space $\mathcal{A}_1$ related to the underlying splitting. For the randomized
case, we show an expected squared error decay rate of $O((m+1)^{-1})$ on a class $\mathcal{A}_\infty^\pi \subset \mathcal{A}_1$
depending on the probability distribution.

1 Introduction

The aim of this paper is to extend convergence results for greedy and randomized versions of
multiplicative Schwarz methods for solving elliptic variational problems in Hilbert spaces from
the case of finite space splittings [11, 21] to the case of infinite space splittings. Let $V$ be a
separable real or complex Hilbert space with scalar product $(\cdot,\cdot)$, let $a(\cdot,\cdot)$ be a continuous and
coercive Hermitian form on $V$, and let $F$ be a bounded linear functional on $V$. Note that $a(\cdot,\cdot)$
induces a spectrally equivalent scalar product on $V$. In the sequel we write $V_a$ to indicate that we
consider $V$ with this new scalar product, and use the notation $\|\cdot\|_a$ for the induced norm. Then
the variational problem

(A) Find $u \in V$ such that

$$a(u,v) = F(v) \qquad \forall\, v \in V,$$

possesses a unique solution, and is equivalent to the quadratic minimization problem

(B) Find the minimizer $u \in V$ of the quadratic functional

$$J(u) := \tfrac{1}{2}\,a(u,u) - F(u).$$

We treat the problem in the form (A), by turning it into an infinite linear system using space
splittings as described next. The equivalent formulation (B) provides the link to convex
optimization, where algorithms similar to the ones considered here are known and investigated
under the name block-coordinate descent methods.

For the separable Hilbert space $V_a$, we consider space splittings generated by families of
Hilbert spaces $V_{a_i}$ (with scalar product $a_i(\cdot,\cdot)$ and norm $\|\cdot\|_{a_i}$) and bounded linear operators
$R_i : V_{a_i} \to V_a$, $i \in I$, such that the span of the subspaces $R_i V_{a_i} \subset V_a$ is dense in $V_a$. Here, the index
set $I$ can be finite ($I = \{1,2,\ldots,N\}$) or countable ($I = \mathbb{N}$). These conditions on a space splitting
are silently assumed throughout this paper. We call a space splitting stable, and write

$$V_a = \sum_{i\in I} R_i V_{a_i} := \Big\{ \sum_{i\in I} R_i v_i \,:\, v_i \in V_{a_i} \Big\}, \tag{1}$$

if

$$0 < \lambda_{\min} := \inf_{u\in V_a} \frac{a(u,u)}{|||u|||^2} \;\le\; \lambda_{\max} := \sup_{u\in V_a} \frac{a(u,u)}{|||u|||^2} < \infty, \tag{2}$$

where

$$|||u|||^2 := \inf_{v_i\in V_{a_i}:\ u=\sum_{i\in I} R_i v_i}\ \sum_{i\in I} a_i(v_i,v_i).$$

Not every infinite space splitting is stable. In particular, in (1) it is assumed that every element in
$V_a$ possesses at least one converging series expansion with respect to $\{R_i V_{a_i}\}$, which follows from
(2) and the assumed density of $\mathrm{span}(\{R_i V_{a_i}\})$ in $V_a$. The constants $\lambda_{\min}$ and $\lambda_{\max}$ are called the lower
and upper stability constants, respectively, and $\kappa := \lambda_{\max}/\lambda_{\min}$ is called the condition of the stable
space splitting (1). A prominent case of stable space splittings are frames and fusion frames,
see I4ll^ ll8l[l9l . In all these definitions, we allow for redundancy, i.e., $R_i V_{a_i} \cap R_j V_{a_j} = \{0\}$ is
not required for $i \ne j$, and we do not assume that the $R_i V_{a_i}$ are closed subspaces of $V_a$.

For the setup of Schwarz iterative methods we need to define operators $T_i : V_a \to V_{a_i}$ via the
variational problems

$$a_i(T_i v, v_i) = a(v, R_i v_i) \qquad \forall\, v_i \in V_{a_i}, \tag{3}$$

to be solved for given $v \in V_a$ on the spaces $V_{a_i}$, $i \in I$. Evaluating $T_i v$ is equivalent to solving a
variational problem in $V_{a_i}$, and it is silently assumed that this is easier than solving the original
problem (A). This stems from the fact that $V_{a_i}$ typically has much smaller dimension than $V_a$
and/or the Hermitian form $a_i(\cdot,\cdot)$ leads to a linear system with better spectral properties or
simpler structure. If the underlying space splitting is finite then, using these $T_i$, analogs of the
classical Jacobi-Richardson and Gauss-Seidel-SOR iterations, called additive and multiplicative
Schwarz methods with respect to (stable) space splittings, can be defined and investigated, pretty
much along the lines of the standard methods, see (iiiiilliiiioiiM). 

Here we formulate a generic version of the multiplicative (also called sequential or
asynchronous) Schwarz method with relaxation suitable for the case of infinite space splittings.
Choose an initial approximation $u^{(0)} \in V_a$ and repeat the following steps for $m = 0,1,\ldots$ until a
stopping criterion is met:

1. Subproblem pick and solution: Given the current $u^{(m)}$ and an index set $I_m \subset I$, choose an
index $i = i_m \in I_m$ (according to some rule to be specified), and compute the partial residual
$r_i^{(m)} := T_i e^{(m)}$, where $e^{(m)} := u - u^{(m)}$. Although $u$ is unknown, this can be done since the
right-hand side in the corresponding subproblem (3) reads

$$a(e^{(m)}, R_i v_i) = F(R_i v_i) - a(u^{(m)}, R_i v_i)$$

and does not depend on knowledge about $u$.

2. Linear update: Determine relaxation parameters $\alpha_m > 0$ and $\omega_m$ (according to some rule
to be specified), and set

$$u^{(m+1)} = \alpha_m u^{(m)} + \omega_m R_i r_i^{(m)}. \tag{4}$$

So far, this is a theoretical algorithm since executing Steps 1 and 2 is not feasible without
further specification and assumptions. In the case of finite splittings, the Schwarz iterative method
figures also under the name alternating directions method (ADM), see p8| for references. 

As to Step 1, we need to specify the rule for picking the next index $i = i_m$. There are at least
three standard versions to be considered:

- Deterministic orderings. In this case, we choose an index sequence $\{i_m\}$ beforehand. In
the case of finite splittings, the default orderings are cyclic ($i_{rN+k-1} = k$) or symmetric-cyclic
($i_{2rN+k-1} = k$, $i_{2rN+N+k-1} = N+1-k$) for $k = 1,\ldots,N$, $r = 0,1,\ldots$, which correspond
to the classical SOR and SSOR methods, respectively. A naive deterministic ordering for
infinite space splittings would be to choose $\{1;\,1,2;\,1,2,3;\,1,2,3,4;\,\ldots\}$. For finite splittings,
convergence is known for cyclic orderings from the ADM theory (compare, e.g., il) , see 
also the convergence rate estimates for Schwarz iterative methods in Qol and, more recently, 
for coordinate descent methods and convex optimization problems (3l. 

- Greedy orderings. The idea goes back to Gauss and Seidel, and was popularized by South- 
well (mill (the corresponding algorithms for finite splittings are often called Gauss- 
Southwell methods). Here the decision for the next index $i_m$ depends on the current
iterate and aims at maximizing the error reduction in the next step. For instance, we can
require $i_m \in I_m$ to satisfy

$$\|r_{i_m}^{(m)}\|_{a_{i_m}} \ge \beta_m \sup_{i\in I_m} \|r_i^{(m)}\|_{a_i}, \tag{5}$$

where $\beta_m \in (0,1]$ is called weakness parameter. This approach is expensive, as it involves the
computation of multiple partial residuals, at least approximately, just to pick the next index.
Most of the research on quantitative convergence results for greedy orderings and infinite 
splittings (see dD for an overview) is devoted to the case $I_m = I = \mathbb{N}$, where finding an
$i_m$ that satisfies (5) in a numerically feasible way can be guaranteed only under additional
assumptions. In practice, one would prefer working with dynamically growing but finite
index sets $I_m \subset I$. In the case of finite splittings, algorithms with greedy orderings have been
analyzed in the more general setting of convex optimization methods, see e.g. [14, 33, 34]
for early results in this direction, for a short proof in the case of problem (A), see dll. 

- Random orderings. Choose a sequence of discrete probability distributions

$$\pi^{(m)} = \{\pi_i^{(m)} \ge 0\}_{i\in I},$$

and pick $i = i_m \in I$ randomly according to $\pi^{(m)}$, $m = 0,1,\ldots$. Even in the case of finite
splittings, a theoretical analysis of such algorithms has been started only recently, but it
revealed that they are competitive with the best (often unknown) deterministic orderings, and
numerically much cheaper than greedy orderings. We refer to fTIl[l3ll2l]l25l for the setting 
of the present paper (quadratic minimization as in (B)), and to [15, 16, 22, 26] for recent
convergence results on block coordinate descent methods for large-scale convex optimization
problems. 
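For a finite index set, the three ordering rules can be sketched as follows. This is a minimal illustration and not part of the original text: the function names, the 0-based indexing, and the assumption that the residual norms $\|r_i^{(m)}\|_{a_i}$ are already available as a list are ours.

```python
import random


def cyclic_index(m, n):
    """Cyclic ordering for a finite splitting with n subproblems (0-based)."""
    return m % n


def weak_greedy_index(residual_norms, beta=1.0):
    """Return the first index i with residual_norms[i] >= beta * max(residual_norms).

    This mimics the weak greedy rule (5) on a finite index set; any index
    that passes the threshold would be admissible, we simply take the first.
    """
    threshold = beta * max(residual_norms)
    return next(i for i, r in enumerate(residual_norms) if r >= threshold)


def random_index(pi, rng):
    """Draw an index according to the discrete probability distribution pi."""
    return rng.choices(range(len(pi)), weights=pi, k=1)[0]
```

With `beta = 1` the greedy rule returns an index of maximal residual norm; smaller values of `beta` admit earlier indices, which is what makes the rule cheaper to satisfy approximately.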

Certainly, there are many more variants to explore. In this paper, we concentrate on greedy and
random orderings for infinite splittings ($I = \mathbb{N}$).

As to Step 2, many options have been discussed in the literature, especially in connection 
with greedy orderings, see mm for an overview and references. 

- The simplest algorithms result if we fix both parameters $\alpha_m = 1$, $\omega_m = \omega$ independently
of $m$, and assume some normalization condition for the $R_i$. Then we arrive at analogs of
the algorithms discussed in the theory of greedy methods under the names weak (WGA)
and weak relaxed (WRGA) greedy algorithms. Their counterparts for $\beta = 1$ are called pure
(PGA) and relaxed (RGA) greedy algorithms, respectively.

- More generally, one can try to find $\alpha_m > 0$, $\omega_m$ simultaneously by minimizing the new error
term

$$\|u - u^{(m+1)}\|_a = \min_{\alpha>0,\,\omega} \|u - (\alpha u^{(m)} + \omega R_i r_i^{(m)})\|_a \tag{6}$$

with respect to $\alpha$ and $\omega$. This is equivalent to solving a two-dimensional quadratic
minimization problem. Various restrictions have been proposed, e.g., $\alpha + \omega = 1$ is a popular
choice, see the early work II1 I12I on relaxed greedy algorithms, and f7l l321 in a slightly 
more general setting. 

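- In this paper, we consider a variant with fixed parameter sequence

$$\alpha_m := 1 - (m+2)^{-1} \in (0,1), \qquad m \ge 0, \tag{7}$$

and with $\omega_m$ determined by minimizing the error term (6) with respect to $\omega$. More explicitly,

$$\omega_m = \frac{a(u - \alpha_m u^{(m)}, R_i r_i^{(m)})}{\|R_i r_i^{(m)}\|_a^2} = \frac{\bar\alpha_m F(R_i r_i^{(m)}) + \alpha_m \|r_i^{(m)}\|_{a_i}^2}{\|R_i r_i^{(m)}\|_a^2}, \tag{8}$$

where $\bar\alpha_m := (1 - \alpha_m) = (m+2)^{-1}$, $m \ge 0$. In the theory of greedy algorithms, this version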
is labeled GAWR, see ll2l l30l . 
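To make Steps 1 and 2 concrete, here is a minimal sketch for a finite-dimensional model problem. Everything specific in it is an assumption made for illustration: $V_a = \mathbb{R}^n$ with $a(u,v) = v^T A u$ for a symmetric positive definite matrix `A`, $F(v) = b^T v$, and the coordinate splitting $R_i = e_i$ with $a_i(x,y) = A_{ii}xy$, so that $\|r_i\|_{a_i} = |(b - Au)_i| / \sqrt{A_{ii}}$. The function names are hypothetical. The relaxation follows $\alpha_m = 1 - (m+2)^{-1}$, and the step along $e_i$ is the exact minimizer, which coincides with the choice of $\omega_m$ above whenever $r_i^{(m)} \neq 0$.

```python
import math
import random


def matvec(A, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def energy_norm_sq(A, x):
    """Squared energy norm x^T A x."""
    return sum(xi * yi for xi, yi in zip(x, matvec(A, x)))


def schwarz_iteration(A, b, steps, rule="random", beta=1.0, pi=None, seed=0):
    """Relaxed multiplicative Schwarz iteration for A u = b, coordinate splitting."""
    n = len(b)
    rng = random.Random(seed)
    pi = pi or [1.0 / n] * n          # uniform distribution unless specified
    u = [0.0] * n                     # starting approximation u^(0) = 0
    for m in range(steps):
        alpha = 1.0 - 1.0 / (m + 2)   # fixed relaxation sequence
        Au = matvec(A, u)
        if rule == "greedy":
            # weak greedy pick based on the norms of the partial residuals
            norms = [abs(b[j] - Au[j]) / math.sqrt(A[j][j]) for j in range(n)]
            threshold = beta * max(norms)
            i = next(j for j in range(n) if norms[j] >= threshold)
        else:
            i = rng.choices(range(n), weights=pi, k=1)[0]
        # damp the iterate, then take the energy-optimal step along e_i
        step = (b[i] - alpha * Au[i]) / A[i][i]
        u = [alpha * uj for uj in u]
        u[i] += step
    return u
```

Note that only `b` and `A` enter the update, never the exact solution, in line with the remark that the partial residual does not require knowledge about $u$.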

Further modifications are listed and investigated in [30, 31]; for extensions to convex
optimization problems on Hilbert and Banach spaces see the recent papers [7, 17, 32, 36]. We also
mention that instead of finding an optimal approximation $u^{(m+1)}$ only from $\mathrm{span}(\{R_i r_i^{(m)}, u^{(m)}\})$,
as done in (6), one could include earlier approximations $u^{(k)}$, $k < m$, into the local search in
Step 2. Orthogonal matching pursuit is a relatively expensive extension of this type. A less
expensive version motivated by the conjugate gradient method would be to find $u^{(m+1)}$ from
$\mathrm{span}(\{R_i r_i^{(m)}, u^{(m)}, u^{(m)} - u^{(m-1)}\})$.

The main contributions of this paper can be summarized as follows. For the greedy case, we
restrict ourselves in (5) to

$$\beta_m = \beta \in (0,1], \qquad I_m = \mathbb{N}, \qquad m \ge 0,$$

and give a convergence proof for the above specified theoretical algorithm. This proof is, in the
case of the Hilbert space setting, a modification of the approach used in [2, 30]. In particular, we
show the convergence rate

$$e_m := \|u - u^{(m)}\|_a = O((m+1)^{-1/2}),$$

for $u$ from the class $\mathcal{A}_1$ which will be defined below. This class appears naturally in all
investigations on greedy algorithms, and the exponent $1/2$ in the convergence rate estimate is known
to be optimal in our considered situation of general space splittings IQ. 

For the case of random picks with a fixed probability distribution $\pi^{(m)} = \pi > 0$, we prove a
similar estimate for the expected error decay,

$$\varepsilon_m := E(\|u - u^{(m)}\|_a^2)^{1/2} = O((m+1)^{-1/2}), \qquad m \to \infty,$$

for $u$ from a smaller class $\mathcal{A}_\infty^\pi \subset \mathcal{A}_1$. To the best of our knowledge, this is the first general
convergence result for randomized Schwarz iterative methods in the case of infinite splittings. Using
an approximation and density argument, we show convergence in expectation (without
guaranteed rate) also for arbitrary $u \in V_a$ and for sequences $\pi^{(m)}$ that converge to a fixed probability
distribution $\pi > 0$ in $\ell^1$.

Although mathematically not difficult, we emphasize that our approach via space splittings
(rather than expansions with respect to dictionaries $\mathcal{D} \subset V$ and updates along one-dimensional
search directions) covers block-iterative methods, auxiliary space techniques, and outer
approximation schemes, which may lead to a broader applicability of our theoretical findings.

The remainder of this paper is organized as follows: In Section 2 we present our convergence
results. To this end, we first introduce approximation spaces associated with a given infinite
space splitting, and give some preparatory lemmata. Then, we prove the main theorems on
convergence estimates for greedy and randomized Schwarz iterations. In Section 3 we discuss
some further results and consequences.


2 Convergence Results

2.1 Approximation spaces related to space splittings 

Throughout this section, set $I = \mathbb{N}$, and fix the families $\{V_{a_i}\}_{i\in\mathbb{N}}$ of auxiliary Hilbert spaces and
$\{R_i : V_{a_i} \to V_a\}_{i\in\mathbb{N}}$ of bounded linear operators. Furthermore, let them be such that the span of
the linear subspaces $R_i V_{a_i}$ is dense in $V_a$. We assume uniform boundedness of the operators $R_i$,
i.e., there exists a constant $\Lambda$ such that

$$\|R_i v_i\|_a^2 = a(R_i v_i, R_i v_i) \le \Lambda^2 a_i(v_i, v_i) = \Lambda^2 \|v_i\|_{a_i}^2, \qquad v_i \in V_{a_i},\ i \in \mathbb{N}. \tag{9}$$

In theory, this can always be achieved by rescaling either $R_i$ or the auxiliary Hermitian forms $a_i$.
For the practical application this is however irrelevant, as the minimization with respect to $\omega$ in
the update Step 2 of the algorithms automatically takes care of this. Unless stated otherwise, we
will not assume that (1) is a stable space splitting for $V_a$.

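For any non-negative weight sequence $\gamma = \{\gamma_i\}_{i\in\mathbb{N}}$ and $0 < q \le \infty$, introduce
approximation spaces $\mathcal{A}_q^\gamma = \mathcal{A}_q^\gamma(\{V_{a_i}, R_i; V_a\})$ as follows: For $u \in \mathrm{span}(\{R_i V_{a_i}\}_{i\in\mathrm{supp}(\gamma)})$, define

$$\|u\|_{\mathcal{A}_q^\gamma} := \inf\Big\{ \big\|\{\gamma_i^{-1}\|v_i\|_{a_i}\}_{i\in I}\big\|_{\ell^q} \,:\, v_i \in V_{a_i};\ u = \sum_{i\in I} R_i v_i \ \text{(with finite } I \subset \mathrm{supp}(\gamma)\text{)} \Big\}$$

and introduce $\mathcal{A}_q^\gamma$ as the completion with respect to this (quasi-)semi-norm. For the weight
sequence $\gamma = 1$, we drop the superscript $\gamma$ from the notation. The cases we are most interested
in are

$$\mathcal{A}_\infty^\pi \subset \mathcal{A}_1 \subset \mathcal{A}_q, \qquad 1 < q < 2,$$

where $\pi \ge 0$ is any given discrete probability distribution, i.e., $\pi_i \ge 0$ and $\sum_{i\in\mathbb{N}} \pi_i = 1$. The
embeddings are continuous. If (1) is a stable splitting, then obviously $\mathcal{A}_2 = V_a$, and all spaces
$\mathcal{A}_\infty^\pi$ and $\mathcal{A}_q$, $q < 2$, are subspaces of $V_a$. They are also dense in $V_a$ (for $\mathcal{A}_\infty^\pi$ under the assumption
that $\mathrm{supp}(\pi) = \mathbb{N}$). For the case of one-dimensional subspaces of $V_a$ generated by a countable
dictionary, these definitions are standard and have been instrumental for setting up a quantitative
convergence theory for greedy algorithms with infinite dictionaries, see [27, 29].

The following technical lemma is crucial for our convergence proofs below.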

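Lemma 1 For the underlying space splitting, assume (9). For any $e \in V_a$, denote $r_i = T_i e \in V_{a_i}$,
$w_i = \|r_i\|_{a_i}^{-1} R_i r_i \in V_a$, $i \in \mathbb{N}$.

a) If $i^*$ is such that $\|r_{i^*}\|_{a_{i^*}} \ge \beta \sup_{i\in\mathbb{N}} \|r_i\|_{a_i}$ for some $0 < \beta \le 1$, then, for any nontrivial $h \in \mathcal{A}_1$,
we have

$$\|r_{i^*}\|_{a_{i^*}} = a(e, w_{i^*}) \ge \frac{\beta}{\|h\|_{\mathcal{A}_1}}\, a(e,h). \tag{10}$$

b) If $\pi$ is any discrete probability distribution, then, for any nontrivial $h \in \mathcal{A}_\infty^\pi$, we have

$$\sum_{i\in\mathbb{N}} \pi_i \|r_i\|_{a_i} = \sum_{i\in\mathbb{N}} \pi_i\, a(e, w_i) \ge \frac{1}{\|h\|_{\mathcal{A}_\infty^\pi}}\, a(e,h). \tag{11}$$

Proof. Since $r_i = T_i e$ is the unique minimizer of the associated quadratic minimization
problem, i.e.,

$$\tfrac12 a_i(T_i e, T_i e) - a(e, R_i T_i e) \le \tfrac12 a_i(v_i, v_i) - a(e, R_i v_i) \qquad \forall\, v_i \in V_{a_i},$$

for any index $i$, one concludes for any $v_i \in V_{a_i}$ with $\|v_i\|_{a_i} \le \|r_i\|_{a_i}$ that

$$\|r_i\|_{a_i}^2 = a(e, R_i T_i e) = \|r_i\|_{a_i}\, a(e, w_i) \ge a(e, R_i v_i).$$

If $r_i \ne 0$, after dividing by $\|r_i\|_{a_i}$, this yields

$$\|r_i\|_{a_i} = a(e, w_i) \ge a(e, R_i v_i) \qquad \forall\, v_i \in V_{a_i}:\ \|v_i\|_{a_i} \le 1.$$

This inequality holds for $r_i = 0$ as well, since in this case $0 = a_i(T_i e, v_i) = a(e, R_i v_i)$ for all
$v_i \in V_{a_i}$. Since by (9)

$$\Big\|\sum_{i\in\mathbb{N}} R_i v_i\Big\|_a \le \sum_{i\in\mathbb{N}} \|R_i v_i\|_a \le \Lambda \sum_{i\in\mathbb{N}} \|v_i\|_{a_i},$$

convergence in $\mathcal{A}_1$ obviously implies convergence in $V_a$, we can assume that for any $\epsilon > 0$, there
exists a finitely representable

$$\tilde h = \sum_{i\in I'} c_i R_i v_i \tag{12}$$

(with finite $I' \subset \mathbb{N}$ and $\|v_i\|_{a_i} \le 1$) such that

$$\Lambda^{-1}\|h - \tilde h\|_a \le \|h - \tilde h\|_{\mathcal{A}_1} < \epsilon, \qquad \sum_{i\in I'} |c_i| \le \|h\|_{\mathcal{A}_1} + \epsilon.$$

For Part a), we thus have

$$a(e, w_{i^*}) \ge \beta \sup_{i\in\mathbb{N}} a(e, w_i) \ge \beta \sup_{i\in I'} a(e, R_i v_i) \ge \frac{\beta}{\|h\|_{\mathcal{A}_1} + \epsilon}\, a(e, \tilde h),$$

and letting $\epsilon \to 0$ implies (10).

Similarly, in Part b) we can choose $\tilde h$ of the form (12) such that

$$\Lambda^{-1}\|h - \tilde h\|_a \le \|h - \tilde h\|_{\mathcal{A}_\infty^\pi} < \epsilon, \qquad |c_i| \le \pi_i(\|h\|_{\mathcal{A}_\infty^\pi} + \epsilon).$$

Then, by the same reasoning

$$\sum_{i\in\mathbb{N}} \pi_i\, a(e, w_i) \ge \sup_{\|v_i\|_{a_i}\le 1} \sum_{i\in\mathbb{N}} \pi_i\, a(e, R_i v_i) = \sup_{\|v_i\|_{a_i}\le 1} a\Big(e, \sum_{i\in\mathbb{N}} \pi_i R_i v_i\Big) \ge \frac{1}{\|h\|_{\mathcal{A}_\infty^\pi} + \epsilon}\, a(e, \tilde h).$$

With $\epsilon \to 0$, we get (11), which finishes the proof of Lemma 1. □

Below, we will apply this lemma with $e = e^{(m)} := u - u^{(m)}$, i.e. the error after $m$ steps of the
algorithm, and with $\beta = \beta_m$, i.e. the weakness parameter in the case of greedy orderings, while
$\pi$ coincides with the probability distribution used to create random orderings.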

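As another preparation, we formulate an auxiliary result for approximation in spaces $\mathcal{A}_\infty^\pi$
that will allow us to work with variable probability distributions (see also Remark 4 in Section 3).

Lemma 2 Assume that $\pi > 0$ is a fixed probability distribution with support $\mathbb{N}$, and assume
that $\pi^{(m)} \ge 0$ is a sequence of probability distributions that converges to $\pi$ in the $\ell^1$ norm. For
the underlying space splitting, assume (9). Then, for any given $h \in \mathcal{A}_\infty^\pi$, there exists a sequence
of finitely representable

$$h^{(m)} = \sum_{i\in I^{(m)}} R_i v_i^{(m)}, \qquad I^{(m)} \text{ finite},$$

such that for $m \ge 0$

$$\|h - h^{(m)}\|_a \le (1+3\Lambda)\|h\|_{\mathcal{A}_\infty^\pi}\,\|\pi - \pi^{(m)}\|_{\ell^1}, \qquad \|h^{(m)}\|_{\mathcal{A}_\infty^{\pi^{(m)}}} \le 3\|h\|_{\mathcal{A}_\infty^\pi}. \tag{13}$$

Proof. Since $h \in \mathcal{A}_\infty^\pi$, for given $m \ge 0$ and $\delta > 0$ there is a finite $I'$ and a

$$\tilde h = \sum_{i\in I'} R_i v_i, \qquad \|v_i\|_{a_i} \le (1+\delta)\pi_i \|h\|_{\mathcal{A}_\infty^\pi}, \quad i \in I',$$

such that

$$\|h - \tilde h\|_a \le \|h\|_{\mathcal{A}_\infty^\pi}\,\|\pi - \pi^{(m)}\|_{\ell^1}.$$

In the definition of $h^{(m)}$, set $v_i^{(m)} = v_i$ and $I^{(m)} = \{i \in \mathbb{N} : \pi_i^{(m)} \ge c\pi_i\} \cap I'$ with a constant $c$ to be fixed below.
Then, obviously, we have

$$\|h - h^{(m)}\|_a \le \|h - \tilde h\|_a + \sum_{i\in I'\setminus I^{(m)}} \|R_i v_i\|_a \le \|h - \tilde h\|_a + \Lambda \sum_{i\in I'\setminus I^{(m)}} \|v_i\|_{a_i}$$
$$\le \|h\|_{\mathcal{A}_\infty^\pi}\Big(\|\pi - \pi^{(m)}\|_{\ell^1} + (1+\delta)\Lambda \sum_{i\in I'\setminus I^{(m)}} \pi_i\Big).$$

But for $i \in I'\setminus I^{(m)}$ we have $(1-c)\pi_i < \pi_i - \pi_i^{(m)}$, thus

$$\|h - h^{(m)}\|_a \le \Big(1 + \frac{(1+\delta)\Lambda}{1-c}\Big)\|h\|_{\mathcal{A}_\infty^\pi}\,\|\pi - \pi^{(m)}\|_{\ell^1}.$$

On the other hand, by the definition of $\tilde h$ and $h^{(m)}$,

$$\|h^{(m)}\|_{\mathcal{A}_\infty^{\pi^{(m)}}} \le \max_{i\in I^{(m)}} \frac{\|v_i\|_{a_i}}{\pi_i^{(m)}} \le \frac{1+\delta}{c}\,\|h\|_{\mathcal{A}_\infty^\pi}.$$

Choosing $\delta = c = 1/2$ then gives the statement of Lemma 2. □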

2.2 Convergence Estimates 


As in the case of finite space splittings im, our convergence proof of the Schwarz iterative 
method for infinite space splittings is based on the same error representation for both greedy 
and random orderings. We therefore state the core estimates together in one place. 

Theorem 1 Consider an infinite space splitting consisting of auxiliary Hilbert spaces $V_{a_i}$ and
bounded linear operators $R_i : V_{a_i} \to V_a$, $i \in \mathbb{N}$, such that $\mathrm{span}(\{R_i V_{a_i}\}_{i\in\mathbb{N}})$ is dense in $V_a$ and
(9) holds. Furthermore, consider a Schwarz iterative method for the variational problem (A)
with starting approximation $u^{(0)} = 0$ and update rule (4), where the parameters $\alpha_m$ and $\omega_m$ are
specified by (7) and (8), respectively.

a) Assume that the update indices $i = i_m$ are chosen according to the greedy rule (5) with $I_m = \mathbb{N}$
and weakness parameter $\beta_m = \beta \in (0,1]$, $m \ge 0$. If $u \in \mathcal{A}_1$ then the squared error decay is given
by

$$\|u - u^{(m)}\|_a^2 \le 2\big(\|u\|_a^2 + (\Lambda/\beta)^2\|u\|_{\mathcal{A}_1}^2\big)(m+1)^{-1}, \qquad m \ge 0.$$

b) Assume that the update indices $i = i_m \in \mathbb{N}$ are chosen randomly and independently according
to a fixed discrete probability distribution $\pi$, $m \ge 0$. If $u \in \mathcal{A}_\infty^\pi$ then the expected squared error
decay is given by

$$E(\|u - u^{(m)}\|_a^2) \le 2\big(\|u\|_a^2 + \Lambda^2\|u\|_{\mathcal{A}_\infty^\pi}^2\big)(m+1)^{-1}, \qquad m \ge 0.$$

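Proof. We derive a recursion for the (expected) squared error. Suppose that $u^{(m)}$ is
determined, and that the $i$-th subproblem solution $r_i^{(m)} := T_i e^{(m)}$ is used for the update to
$u^{(m+1)}$ according to (4). Thus, we can write

$$e^{(m+1)} := u - u^{(m+1)} = u - (\alpha_m u^{(m)} + \omega_m R_i r_i^{(m)}) = \alpha_m e^{(m)} + \bar\alpha_m (u - \xi_{i,m} w_i^{(m)}),$$

where $\bar\alpha_m = 1 - \alpha_m$, and, in agreement with the previous subsection, $w_i^{(m)} = \|r_i^{(m)}\|_{a_i}^{-1} R_i r_i^{(m)}$. The
parameter $\xi_{i,m} = \omega_m \|r_i^{(m)}\|_{a_i} / \bar\alpha_m$ is found by solving the minimization problem

$$\|e^{(m+1)}\|_a^2 = \min_{\xi} \|\alpha_m e^{(m)} + \bar\alpha_m (u - \xi w_i^{(m)})\|_a^2.$$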



M. Griebel, P. Oswald 


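Thus, for any $\xi$ and any chosen $i$, we have

$$\|e^{(m+1)}\|_a^2 \le \alpha_m^2\|e^{(m)}\|_a^2 + 2\alpha_m\bar\alpha_m\big(a(e^{(m)},u) - \xi\, a(e^{(m)}, w_i^{(m)})\big) + \bar\alpha_m^2 \|u - \xi w_i^{(m)}\|_a^2.$$

Using the triangle inequality and (9), the norm in the last term can be bounded, independently
of $i$, by

$$\|u - \xi w_i^{(m)}\|_a^2 \le 2\big(\|u\|_a^2 + \xi^2 \|r_i^{(m)}\|_{a_i}^{-2}\|R_i r_i^{(m)}\|_a^2\big) \le 2\big(\|u\|_a^2 + \xi^2\Lambda^2\big).$$

For dealing with the term $a(e^{(m)},u) - \xi\, a(e^{(m)}, w_i^{(m)})$, we invoke Lemma 1. In the case of
greedy orderings, we take $h = u \in \mathcal{A}_1$ in Part a) of Lemma 1 and choose $\xi = \beta^{-1}\|u\|_{\mathcal{A}_1}$, so that
this term is non-positive. We then arrive at

$$\|e^{(m+1)}\|_a^2 \le \alpha_m^2\|e^{(m)}\|_a^2 + 2\bar\alpha_m^2\big(\|u\|_a^2 + (\Lambda/\beta)^2\|u\|_{\mathcal{A}_1}^2\big). \tag{14}$$

Denote $c_m := (m+1)\|e^{(m)}\|_a^2$ and $M := 2(\|u\|_a^2 + (\Lambda/\beta)^2\|u\|_{\mathcal{A}_1}^2)$. Substituting the concrete
values of $\alpha_m = 1 - (m+2)^{-1}$ and $\bar\alpha_m = (m+2)^{-1}$ into (14), we obtain

$$c_{m+1} \le \alpha_m c_m + \bar\alpha_m M, \qquad m \ge 0.$$

Since $c_0 = \|e^{(0)}\|_a^2 = \|u\|_a^2 \le M$, this implies $c_m \le M$ for all $m \ge 0$, and proves the result in Part
a) of Theorem 1.

In the case of random orderings, we now use Lemma 1 b) for the given discrete
probability distribution $\pi$ with $h = u \in \mathcal{A}_\infty^\pi$. This yields the following estimate for the conditional
expectation of $\|e^{(m+1)}\|_a^2$ with respect to given $u^{(m)}$, valid for any $\xi > 0$:

$$E(\|e^{(m+1)}\|_a^2 \,|\, u^{(m)}) \le \alpha_m^2\|e^{(m)}\|_a^2 + 2\alpha_m\bar\alpha_m\Big(a(e^{(m)},u) - \xi\sum_i \pi_i\, a(e^{(m)}, w_i^{(m)})\Big) + 2\bar\alpha_m^2\big(\|u\|_a^2 + \xi^2\Lambda^2\big)$$
$$\le \alpha_m^2\|e^{(m)}\|_a^2 + 2\alpha_m\bar\alpha_m\, a(e^{(m)},u)\Big(1 - \frac{\xi}{\|u\|_{\mathcal{A}_\infty^\pi}}\Big) + 2\bar\alpha_m^2\big(\|u\|_a^2 + \xi^2\Lambda^2\big).$$

Thus, fixing $\xi = \|u\|_{\mathcal{A}_\infty^\pi}$ and taking the expectations with respect to $u^{(m)}$, we get

$$E(\|e^{(m+1)}\|_a^2) \le \alpha_m^2 E(\|e^{(m)}\|_a^2) + 2\bar\alpha_m^2\big(\|u\|_a^2 + \Lambda^2\|u\|_{\mathcal{A}_\infty^\pi}^2\big). \tag{15}$$

This gives Part b) of Theorem 1 if we argue as before. □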

Using a density argument as in [2, 30], one can extend the estimate of Theorem 1 and show
convergence for all $u \in V_a$.

Theorem 2 Under the same assumptions as in Theorem 1, we have convergence and expected
convergence $u^{(m)} \to u$ in $V_a$ for the greedy and random Schwarz iterative methods, respectively,
with no additional assumptions on the solution $u \in V_a$ of the variational problem (A).

More precisely, for the greedy version specified in Part a) of Theorem 1 and any $h \in \mathcal{A}_1$, we
have

$$\|u - u^{(m)}\|_a \le 2\|u - h\|_a + \frac{\sqrt{8(\|u\|_a^2 + (\Lambda/\beta)^2\|h\|_{\mathcal{A}_1}^2)}}{(m+1)^{1/2}}, \qquad m \ge 0. \tag{16}$$


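For the random version specified in Part b) of Theorem 1 and any $h \in \mathcal{A}_\infty^\pi$, we have

$$E(\|u - u^{(m)}\|_a^2)^{1/2} \le 2\|u-h\|_a + \frac{\sqrt{8(\|u\|_a^2 + \Lambda^2\|h\|_{\mathcal{A}_\infty^\pi}^2)}}{(m+1)^{1/2}}, \qquad m \ge 0. \tag{17}$$

Proof. We start with the greedy case, take any $h \in \mathcal{A}_1$. Repeat the proof of Part a) of
Theorem 1. When invoking Lemma 1 a), use it with $h$ instead of $u$, and set $\xi = \beta^{-1}\|h\|_{\mathcal{A}_1}$. Then

$$a(e^{(m)},u) - \xi\, a(e^{(m)}, w_i^{(m)}) \le a(e^{(m)}, u-h) \le \|e^{(m)}\|_a \|u-h\|_a,$$

and (14) can be replaced by

$$\|e^{(m+1)}\|_a^2 \le \alpha_m^2\|e^{(m)}\|_a^2 + 2\alpha_m\bar\alpha_m\|e^{(m)}\|_a\|u-h\|_a + \bar\alpha_m^2 M,$$

where $M = 2(\|u\|_a^2 + (\Lambda/\beta)^2\|h\|_{\mathcal{A}_1}^2)$. If $\|e^{(m+1)}\|_a \ge \alpha_m\|e^{(m)}\|_a$, then

$$\|e^{(m+1)}\|_a - \alpha_m\|e^{(m)}\|_a \le \frac{\|e^{(m+1)}\|_a^2 - \alpha_m^2\|e^{(m)}\|_a^2}{\alpha_m\|e^{(m)}\|_a}.$$

Substituting the previous estimate, this yields

$$\|e^{(m+1)}\|_a \le \alpha_m\|e^{(m)}\|_a + 2\bar\alpha_m\|u-h\|_a + \frac{\bar\alpha_m^2 M}{\alpha_m\|e^{(m)}\|_a}. \tag{18}$$

If, alternatively, $\|e^{(m+1)}\|_a < \alpha_m\|e^{(m)}\|_a$, then (18) holds
trivially. The inequality (18) is complemented by

$$\|e^{(m+1)}\|_a = \inf_{\xi}\|\alpha_m e^{(m)} + \bar\alpha_m(u - \xi w_i^{(m)})\|_a \le \alpha_m\|e^{(m)}\|_a + \bar\alpha_m\|u\|_a. \tag{19}$$

In the random case, we proceed similarly. For any $h \in \mathcal{A}_\infty^\pi$, we apply Lemma 1 b) with
$\xi = \|h\|_{\mathcal{A}_\infty^\pi} > 0$. This shows

$$a(e^{(m)},u) - \xi\sum_{i\in\mathbb{N}} \pi_i\, a(e^{(m)}, w_i^{(m)}) \le a(e^{(m)}, u-h) \le \|e^{(m)}\|_a\|u-h\|_a,$$

and, instead of (15), we obtain

$$E(\|e^{(m+1)}\|_a^2) \le \alpha_m^2 E(\|e^{(m)}\|_a^2) + 2\alpha_m\bar\alpha_m E(\|e^{(m)}\|_a)\|u-h\|_a + \bar\alpha_m^2 \tilde M,$$

where $\tilde M := 2(\|u\|_a^2 + \Lambda^2\|h\|_{\mathcal{A}_\infty^\pi}^2)$. Using the notation $\varepsilon_m := E(\|e^{(m)}\|_a^2)^{1/2}$ and the obvious
inequality $E(\|e^{(m)}\|_a) \le \varepsilon_m$, by the same reasoning as above, we arrive at the following
replacement for (18):

$$\varepsilon_{m+1} \le \alpha_m\varepsilon_m + 2\bar\alpha_m\|u-h\|_a + \frac{\bar\alpha_m^2 \tilde M}{\alpha_m\varepsilon_m}. \tag{20}$$

To obtain a complementary estimate analogous to (19), by definition of $u^{(m+1)}$ we can write

$$\|e^{(m+1)}\|_a^2 \le \|\alpha_m e^{(m)} + \bar\alpha_m u\|_a^2 \le \alpha_m^2\|e^{(m)}\|_a^2 + 2\alpha_m\bar\alpha_m\|u\|_a\|e^{(m)}\|_a + \bar\alpha_m^2\|u\|_a^2.$$

Then we take expectations on both sides, use again $E(\|e^{(m)}\|_a) \le \varepsilon_m$, and obtain

$$\varepsilon_{m+1} \le \big(\alpha_m^2\varepsilon_m^2 + 2\alpha_m\bar\alpha_m\|u\|_a E(\|e^{(m)}\|_a) + \bar\alpha_m^2\|u\|_a^2\big)^{1/2} \le \alpha_m\varepsilon_m + \bar\alpha_m\|u\|_a. \tag{21}$$

Up to different constants $M$ and $\tilde M$, the recursive inequalities (18), (19) for the sequence $\{e_m\}$
and (20), (21) for $\{\varepsilon_m\}$ are identical. Therefore, it is enough to consider the random case. Set
$b_m := (m+1)^{1/2}\tilde M^{-1/2}(\varepsilon_m - 2\|u-h\|_a)$, $m \ge 0$. Then, a quick calculation shows that (21) turns
into

$$b_{m+1} \le \alpha_m^{1/2} b_m + \frac{B}{(m+2)^{1/2}}, \qquad m \ge 0, \tag{22}$$

where $B := \|u\|_a \tilde M^{-1/2}$, while under the assumption $b_m > 0$ the inequality (20) implies

$$b_{m+1} \le \alpha_m^{1/2}\Big(b_m + \frac{1}{(m+1)b_m}\Big). \tag{23}$$

A similar recursive system of inequalities has been considered in [2, 30]. Lemma 3 stated below
implies $b_m \le 2$ for all $m \ge 0$ (note that $b_0 \le B \le 1/\sqrt{2}$). This yields

$$\varepsilon_m \le 2\|u-h\|_a + \frac{2\tilde M^{1/2}}{(m+1)^{1/2}}, \qquad m \ge 0,$$

and proves (17). The estimate (16) is derived in complete analogy.

To show convergence for arbitrary $u \in V_a$, for given $\epsilon > 0$, choose $h \in \mathcal{A}_1$ by the density of
$\mathcal{A}_1$ in $V_a$ such that $\|u-h\|_a < \epsilon/3$. Then, with this $h$ fixed, the second term in the right-hand
side of (16) will become $< \epsilon/3$ for all large enough $m$ as well. This proves convergence for the
greedy version. An analogous argument shows $E(\|u - u^{(m)}\|_a^2) \to 0$ as $m \to \infty$ for the random
case. Theorem 2 is fully established. □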

For the convenience of the reader, we conclude this section with the short proof of the
boundedness of sequences $\{b_m\}_{m\ge 0}$ satisfying the recursions (22), (23) used in the proof of
Theorem 2.

Lemma 3 Suppose a sequence $\{b_m\}_{m\ge 0}$ satisfies the inequalities (22) and (23), where
$\alpha_m = (m+1)/(m+2)$ and $B \ge 0$. Fix a constant $A \ge B/\sqrt{2} + \sqrt{2}$. Then $b_0 \le A$ implies $b_m \le A$ for
all $m \ge 1$. In particular, if $B \le 1/\sqrt{2}$ one can choose $A = 2$.

Proof. We use induction in $m$. Assume $b_m \le A$. For a value $t = t_m > 0$ to be fixed below, we
consider two cases. If $b_m \le t$, by (22) we have

$$b_{m+1} \le \alpha_m^{1/2}\, t + \frac{B}{(m+2)^{1/2}} \le A,$$

if

$$t \le \alpha_m^{-1/2}\Big(A - \frac{B}{(m+2)^{1/2}}\Big). \tag{24}$$

On the other hand, if $t < b_m \le A$ we use (23) which gives

$$b_{m+1} \le \alpha_m^{1/2}\Big(A + \frac{1}{(m+1)t}\Big) \le A,$$

if

$$t \ge \frac{\alpha_m^{1/2}}{(1-\alpha_m^{1/2})(m+1)A} = \frac{(m+1)^{1/2} + (m+2)^{1/2}}{(m+1)^{1/2}A} = \alpha_m^{-1/2}\,\frac{(m+1)^{1/2} + (m+2)^{1/2}}{(m+2)^{1/2}A}. \tag{25}$$

It is easy to see that the choice $t = t_m := 2\alpha_m^{-1/2}/A > 0$ satisfies both (24) and (25) if $2/A \le A - B/\sqrt{2}$.
The latter follows from the assumption on $A$, which implies $A \ge A - B/\sqrt{2} \ge \sqrt{2} \ge 2/A$.
This finishes the induction step, and proves Lemma 3. □
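As a numerical sanity check of Lemma 3, one can iterate the largest sequence compatible with both recursive inequalities, i.e. $b_{m+1} = \min\{\alpha_m^{1/2} b_m + B(m+2)^{-1/2},\ \alpha_m^{1/2}(b_m + ((m+1)b_m)^{-1})\}$, and confirm that it never exceeds $A = 2$ when $B \le 1/\sqrt{2}$. The sketch below is ours (the function name is hypothetical) and illustrates the statement for particular inputs; it does not replace the proof.

```python
import math


def worst_case_sequence(b0, B, steps):
    """Iterate the extremal sequence allowed by the two recursive bounds."""
    b = b0
    seq = [b]
    for m in range(steps):
        root_alpha = math.sqrt((m + 1) / (m + 2))
        bound_a = root_alpha * b + B / math.sqrt(m + 2)
        if b > 0:
            # the second bound is only available for positive b
            bound_b = root_alpha * (b + 1.0 / ((m + 1) * b))
            b = min(bound_a, bound_b)
        else:
            b = bound_a
        seq.append(b)
    return seq
```

For example, `max(worst_case_sequence(2.0, 1 / math.sqrt(2), 10000))` stays at the starting value 2, in agreement with the lemma.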


3 Further Results and Discussion 

Remark 1. Our results for the expected error decay for random orderings immediately imply
estimates in probability. Using the Markov-Chebyshev inequality, under the assumptions of
Theorem 1 b), we get

$$P\big(\|u - u^{(m)}\|_a^2 \le \varepsilon^2\big) \ge 1 - \frac{8(\|u\|_a^2 + \Lambda^2\|u\|_{\mathcal{A}_\infty^\pi}^2)}{(m+1)\varepsilon^2}, \qquad m \ge 0,$$

for any error threshold $\varepsilon > 0$, or, equivalently,

$$P\Big(\|u - u^{(m)}\|_a^2 \le \frac{8(\|u\|_a^2 + \Lambda^2\|u\|_{\mathcal{A}_\infty^\pi}^2)}{(m+1)\delta}\Big) \ge 1 - \delta, \qquad m \ge 0,$$

for any confidence level $\delta$. An investigation of the variance or other higher-order moments of
the squared error that could lead to improved estimates has not been undertaken yet. Numerical
experiments with randomized Schwarz iterations for finite splittings imiiii suggest that the 
variance is reasonably small in practice. 
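The probability estimate of Remark 1 can be turned into a sufficient iteration count: the bound $8K/((m+1)\varepsilon^2) \le \delta$ with $K := \|u\|_a^2 + \Lambda^2\|u\|_{\mathcal{A}_\infty^\pi}^2$ holds as soon as $m + 1 \ge 8K/(\delta\varepsilon^2)$. The helper below is a hypothetical illustration; in particular, $K$ is usually not known in practice and is treated here as a given input.

```python
import math


def iterations_for_confidence(K, eps, delta):
    """Smallest m >= 0 such that 8*K / ((m + 1) * eps**2) <= delta.

    K plays the role of ||u||_a^2 + Lambda^2 * ||u||^2 in the approximation
    space norm; eps is the error threshold and delta the failure probability.
    """
    return max(0, math.ceil(8.0 * K / (delta * eps ** 2)) - 1)
```

Because the count grows like $1/(\delta\varepsilon^2)$, halving the error threshold quadruples the number of iterations guaranteed to suffice.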

Remark 2. If in addition to our assumptions on space splittings we assume that (1) is a
stable space splitting of $V_a$, i.e., if $\mathcal{A}_2 = V_a$ holds, then, for $u \in \mathcal{A}_q$, $1 < q < 2$, and the greedy
version of the Schwarz iterative method specified in Part a) of Theorem 1, we have the error
decay rate

m>0, (26) 

where $C$ is some constant depending on $\beta$ and the upper stability constant $\lambda_{\max}$ only.
This can be established by an interpolation argument along the lines of [2, 30], where the authors
consider the special case of splittings into one-dimensional subspaces $V_{a_j} = \{\lambda_j \psi_j : \lambda_j \in \mathbb{K}\}$ of
$V_a$ induced by a dictionary $\mathcal{D} = \{\psi_j\}$ of unit norm elements $\psi_j \in V_a$ such that its span is dense
in $V_a$. For this case, the estimate (26) may be replaced by a similar statement for $u \in \mathcal{B}_q$, where
$\mathcal{B}_q$, $1 < q < 2$, is obtained by real interpolation for the pair $(\mathcal{A}_1, V_a)$. In the special case when
$\mathcal{D}$ is a frame in $V_a$ and thus $V_a = \mathcal{A}_2$, the scales $\mathcal{A}_q$ and $\mathcal{B}_q$ coincide for $1 < q < 2$; in general,
they are different.

Remark 3. In [29], weaker error estimates for other greedy algorithms such as PGA and
WGA can be found. We believe that their proofs carry over to the setting based on space
splittings without difficulties. E.g., for the PGA with $\alpha_m = \beta = 1$, we expect

||u — ||a < , m>l, uE^i, 

to hold, see [6] for the PGA in the dictionary case. Whether the exponent $1/6$ can be increased
to $1/2$ under the assumption that (1) is a stable space splitting is an open problem, even when the
space splitting comes from a frame. Slightly better exponents are possible for PGA and WGA,
see [29, 31].

Remark 4. Theorems 1 and 2 provide convergence guarantees under theoretical
assumptions that still look questionable from a practical point of view: The question of rounding errors
is not addressed, for the greedy version the condition (5) needs to be checked for an infinite
index set $I_m = \mathbb{N}$, while in the random Schwarz iterative method drawing the next index $i = i_m$
according to a (rather general) discrete probability distribution $\pi$ defined on $\mathbb{N}$ seems
inconvenient as well. For greedy algorithms based on dictionaries, there are partial results in this
direction I^l7l l28ll which can be adapted to the case of space splittings considered here. 

We concentrate on the randomized version. When combined with Lemma 2, the estimation
techniques leading to Theorem 2 give the following result which, in particular, allows us to work
with finitely supported probability distributions $\pi^{(m)}$ that converge to a desired $\pi > 0$ sufficiently
fast, without sacrificing convergence speed.

Proposition 1 Assume that the indices $i = i_m$ in the random Schwarz iteration are chosen using
discrete probability distributions $\pi^{(m)} \ge 0$ such that

$$\|\pi^{(m)} - \pi\|_{\ell^1} \le D(m+2)^{-1/2}, \qquad m \ge 0, \tag{27}$$

for some $\pi > 0$ and some constant $D > 0$. Then, assuming the remaining conditions of Theorem 1,
Part b), we have the estimate

$$E(\|u - u^{(m)}\|_a^2)^{1/2} \le 2\|u - h\|_a + C(m+1)^{-1/2}, \qquad m \ge 0, \tag{28}$$

for any $h \in \mathcal{A}_\infty^\pi$ with some constant $C$ depending on $\Lambda$, $\|u\|_a$, $\|h\|_{\mathcal{A}_\infty^\pi}$, and the constant $D$ in (27).

Proof. We repeat the same steps that lead to (20), with the following changes: The
conditional expectation is now computed with respect to $\pi^{(m)}$. For estimating
the difference $a(e^{(m)},u) - \xi\sum_i \pi_i^{(m)} a(e^{(m)}, w_i^{(m)})$, we use the $h^{(m)} \in \mathcal{A}_\infty^{\pi^{(m)}}$ whose existence is
guaranteed by Lemma 2, and set $\xi = \|h^{(m)}\|_{\mathcal{A}_\infty^{\pi^{(m)}}}$. This gives

$$a(e^{(m)},u) - \xi\sum_i \pi_i^{(m)} a(e^{(m)}, w_i^{(m)}) \le a(e^{(m)}, u - h^{(m)}) \le \|e^{(m)}\|_a\big(\|u-h\|_a + \|h - h^{(m)}\|_a\big)$$
$$\le \|e^{(m)}\|_a\big(\|u-h\|_a + D(1+3\Lambda)\|h\|_{\mathcal{A}_\infty^\pi}(m+2)^{-1/2}\big),$$

where we have substituted (13) and (27). Using as before the notation $\varepsilon_m = E(\|e^{(m)}\|_a^2)^{1/2}$ and
the obvious inequality $E(\|e^{(m)}\|_a) \le \varepsilon_m$, we arrive at the following replacement for (20):

$$\varepsilon_{m+1} \le \alpha_m\varepsilon_m + 2\bar\alpha_m\|u-h\|_a + C_0(m+2)^{-3/2} + \frac{\bar\alpha_m^2 C_1}{\alpha_m\varepsilon_m}, \tag{29}$$

where as before $\bar\alpha_m := (m+2)^{-1}$. The constants are $C_0 = 2D(1+3\Lambda)\|h\|_{\mathcal{A}_\infty^\pi}$ and
$C_1 = 2(\|u\|_a^2 + 9\Lambda^2\|h\|_{\mathcal{A}_\infty^\pi}^2)$, see (13). This gives a recursion for

$$b_m := (m+1)^{1/2} C_1^{-1/2}(\varepsilon_m - 2\|u-h\|_a), \qquad m \ge 0,$$

similar to (23) but with a new term induced by the additional term $C_0(m+2)^{-3/2}$ in (29):

$$b_{m+1} \le \alpha_m^{1/2}\Big(b_m + \frac{1}{(m+1)b_m}\Big) + \frac{\tilde B}{m+2}, \qquad \tilde B := C_0 C_1^{-1/2}. \tag{30}$$

This relation is again complemented by the inequality (22), this time with the constant
$B = \|u\|_a C_1^{-1/2} \le 1/\sqrt{2}$. Since repeating the proof of Lemma 3 with the additional term in the
right-hand side of (30) does not represent any difficulty, we leave it to the reader to show that $b_m \le A$,
$m \ge 0$, holds for some new constant $A$ depending on $B$ and $\tilde B$. This shows (28) with $C = A C_1^{1/2}$,
and finishes our sketch of the proof of Proposition 1. □

Remark 5. In the generality considered here, the obtained convergence rates for the error
$\|u - u^{(m)}\|_a$ are not very impressive but unfortunately cannot be improved much. For greedy
orderings, this issue has been addressed in l6l l27ll^ . We add some comments for random 
orderings. Consider the very special situation of a one-dimensional subspace splitting induced
by a complete orthonormal system $\mathcal{D} = \{\psi_i\}_{i\in\mathbb{N}}$ in $V$ with $a(\cdot,\cdot) = (\cdot,\cdot)$, and the problem of incremental
approximation of a given $u \in V$ by linear combinations of elements from $\mathcal{D}$. I.e., if

$$u = \sum_{i\in\mathbb{N}} c_i \psi_i$$

is the unique orthogonal decomposition of $u$ with respect to $\mathcal{D}$, then we have $R_i T_i u = c_i \psi_i$. Fix
the discrete probability distribution $\pi > 0$, and consider the associated randomized Schwarz
iterative method with updates of the form (4). It is easy to find that, due to the orthogonality of
the splitting, the best expected convergence rate is achieved for $\alpha_m = 1$. In that case, we have

$$u^{(m)} = \sum_{i\in I^{(m)}} c_i \psi_i, \qquad \|u - u^{(m)}\|^2 = \sum_{i\notin I^{(m)}} |c_i|^2,$$

with probability $\prod_{k=0}^{m-1} \pi_{i_k}$, where $\{i_k\}_{k\ge 0}$ is the random index sequence, and $I^{(m)}$ is the set of
the first $m$ such indices ($I^{(m)}$ may have cardinality $< m$, repetitions are possible). We leave it to
the reader to verify the identity

$$E(\|u - u^{(m)}\|^2) = \sum_{i\in\mathbb{N}} |c_i|^2 (1 - \pi_i)^m, \qquad m \ge 1.$$

In the particular case considered, the formula confirms the statement of Theorem 2: Since
$\pi_i > 0$ for all $i \in \mathbb{N}$, we have $E(\|u - u^{(m)}\|^2) \to 0$ as $m \to \infty$, i.e., the expected error converges
to 0 for any $u \in V$ and any probability distribution $\pi > 0$.
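The identity $E(\|u - u^{(m)}\|^2) = \sum_i |c_i|^2(1-\pi_i)^m$ for the orthonormal case with $\alpha_m = 1$ can be checked by brute force on a truncated system. The sketch below is an illustration under our own assumptions: a finite orthonormal system of `n` elements, independent draws from `pi`, and hypothetical function names. It enumerates all index sequences of length `m`, weights the remaining squared coefficients by the probability of each sequence, and compares the result with the closed form.

```python
from itertools import product


def expected_sq_error_enum(c, pi, m):
    """Exact expected squared error by enumerating all index sequences."""
    n = len(c)
    total = 0.0
    for seq in product(range(n), repeat=m):
        prob = 1.0
        for i in seq:
            prob *= pi[i]
        picked = set(seq)
        # coefficients whose index was never drawn are still missing
        total += prob * sum(abs(c[i]) ** 2 for i in range(n) if i not in picked)
    return total


def expected_sq_error_formula(c, pi, m):
    """Closed form: index i is missed in m independent draws with prob (1 - pi_i)^m."""
    return sum(abs(ci) ** 2 * (1.0 - p) ** m for ci, p in zip(c, pi))
```

The enumeration costs $n^m$ terms and is only meant for small examples; the closed form follows because the events "index $i$ was never drawn" have probability $(1-\pi_i)^m$.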

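On the other hand, when inspecting the statement of Theorem 1 b) for our case, we see that
$u \in \mathcal{A}_\infty^\pi$ is equivalent to the inequality

$$|c_i| \le C\pi_i, \quad i \in \mathbb{N}, \qquad C := \|u\|_{\mathcal{A}_\infty^\pi} < \infty.$$

Thus, for $u \in \mathcal{A}_\infty^\pi$ we have

$$E(\|u - u^{(m)}\|^2) \le C^2 \sum_{i\in\mathbb{N}} \pi_i^2 (1-\pi_i)^m, \qquad m \ge 1, \tag{31}$$

which is sharp in the sense that equality holds (simultaneously for all $m \ge 1$) if we set $c_i = C\pi_i$.
Since $\phi(t) = t^2(1-t)^m \le c/(m+1)^2$ for $t \in [0,1]$ for some absolute constant $c$ and all $m \ge 1$,
we get

$$\sum_{i\in\mathbb{N}} \pi_i^2(1-\pi_i)^m \le \sum_{\pi_i < (m+1)^{-1}} \pi_i^2 + \sum_{\pi_i \ge (m+1)^{-1}} \frac{c}{(m+1)^2} \le \frac{1+c}{m+1}.$$

Indeed, the first sum can be estimated according to

$$\sum_{\pi_i < (m+1)^{-1}} \pi_i^2 \le \frac{1}{m+1} \sum_{i\in\mathbb{N}} \pi_i \le \frac{1}{m+1},$$

and the second is a finite sum with at most $m+1$ terms. This result is in line with the bound of
Theorem 1.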

No substantial improvement of the decay rate $O((m+1)^{-1})$ can be expected for general
probability distributions: For each fixed $m$, taking $\pi$ sufficiently close to the uniform distribution
on $\{1,\ldots,m+1\}$ provides a lower bound of order $1/(m+1)$ for the right-hand side in (31), while
choosing $\pi > 0$ according to

$$\pi_i = \frac{c}{i \log(i+1)^2}, \quad i \in \mathbb{N}, \qquad c := \Big(\sum_{i\in\mathbb{N}} \frac{1}{i\log(i+1)^2}\Big)^{-1},$$

shows that, for some $u \in \mathcal{A}_\infty^\pi$, lower bounds of the form $E(\|u - u^{(m)}\|^2) \ge c_a (m+1)^{-a}$ may
hold for all $m \ge 0$ simultaneously, with any $a > 1$. However, for sequences $\pi$ of the form

;r,'= ieN, c, := > 

Vi'GN / 


with $s > 0$, slight improvements are possible. Note that for specific complete orthonormal
systems (such as the trigonometric system in $V = L_2(\mathbb{T})$) the classes $\mathcal{A}_\infty^\pi$ for such $\pi$ have natural
interpretations as L^-Besov-Lipschitz spaces (in the case of our example this would be and 
$s$ corresponds to a smoothness parameter). Better rates can also be concluded if we assume that
$u$ belongs to a smaller class of this type with parameter $s' > s$.

Although the validity of these observations heavily relies on the assumed orthogonality of 
the splitting, we believe that especially the randomized versions should be investigated further. 
In particular, improved convergence rates for special classes of space splittings (e.g., induced 
by multilevel frames and sparse grid spaces) are desirable, and the potential of randomization 
techniques for the development of new adaptive algorithms needs to be further evaluated.